SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: NI, JIAN 
(ii) TITLE OF INVENTION: HUMAN TUMOR NECROSIS FACTOR RECEPTOR 



(iii) NUMBER OF SEQUENCES: 15 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: HUMAN GENOME SCIENCES, INC. 

(B) STREET: 9410 KEY WEST AVENUE 

(C) CITY: ROCKVILLE 

(D) STATE: MD 

(E) COUNTRY: US 

(F) ZIP: 20850 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: BROOKES, ANDERS A. 

(B) REGISTRATION NUMBER: 36,373 

(C) REFERENCE/DOCKET NUMBER: PF379PP2 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (301) 309-8504 

(B) TELEFAX: (301) 309-8512 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3566 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 109..1266

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 109..271

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 274..1266

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
CGACCCACGC GTCCGCCCAC GCGTCCGGAG AACCTTTGCA CGCGCACAAA CTACGGGGAC 60 

GATTTCTGAT TGATTTTTGG CGCTTTCGAT CCACCCTCCT CCCTTCTC ATG GGA CTT 117
                                                     Met Gly Leu
                                                     -55

TGG GGA CAA AGC GTC CCG ACC GCC TCG AGC GC? CGA GCA GGG CGC TAT 165
Trp Gly Gln Ser Val Pro Thr Ala Ser Ser Ala Arg Ala Gly Arg Tyr
        -50                 -45                 -40

CCA GGA GCC AGG ACA GCG TCG GGA ACC AGA CCA TGG CTC CTG GAC CCC 213
Pro Gly Ala Arg Thr Ala Ser Gly Thr Arg Pro Trp Leu Leu Asp Pro
    -35                 -30                 -25

AAG ATC CTT AAG TTC GTC GTG TTC ATC GTC GCG GTT CTG CTG CCG GTC 261
Lys Ile Leu Lys Phe Val Val Phe Ile Val Ala Val Leu Leu Pro Val
-20                 -15                 -10                 -5

CGG GTT GAC TCT GCC ACC ATC CCC CGG CAG GAC GAA GTT CCC CAG CAG 309
Arg Val Asp Ser Ala Thr Ile Pro Arg Gln Asp Glu Val Pro Gln Gln
                1               5                   10

ACA GTG GCC CCA CAG CAA CAG AGG CGC AGC CTC AAG GAG GAG GAG TGT 357
Thr Val Ala Pro Gln Gln Gln Arg Arg Ser Leu Lys Glu Glu Glu Cys
        15                  20                  25

CCA GCA GGA TCT CAT AGA TCA GAA TAT ACT GGA GCC TGT AAC CCG TGC 405
Pro Ala Gly Ser His Arg Ser Glu Tyr Thr Gly Ala Cys Asn Pro Cys
    30                  35                  40

ACA GAG GGT GTG GAT TAC ACC ATT GCT TCC AAC AAT TTG CCT TCT TGC 453
Thr Glu Gly Val Asp Tyr Thr Ile Ala Ser Asn Asn Leu Pro Ser Cys
45                  50                  55                  60

CTG CTA TGT ACA GTT TGT AAA TCA GGT CAA ACA AAT AAA AGT TCC TGT 501
Leu Leu Cys Thr Val Cys Lys Ser Gly Gln Thr Asn Lys Ser Ser Cys
                65                  70                  75

ACC ACG ACC AGA GAC ACC GTG TGT CAG TGT GAA AAA GGA AGC TTC CAG 549
Thr Thr Thr Arg Asp Thr Val Cys Gln Cys Glu Lys Gly Ser Phe Gln
            80                  85                  90

GAT AAA AAC TCC CCT GAG ATG TGC CGG ACG TGT AGA ACA GGG TGT CCC 597
Asp Lys Asn Ser Pro Glu Met Cys Arg Thr Cys Arg Thr Gly Cys Pro
        95                  100                 105



AGA GGG ATG GTC AAG GTC AGT AAT TGT ACG CCC CGG AGT GAC ATC AAG 645
Arg Gly Met Val Lys Val Ser Asn Cys Thr Pro Arg Ser Asp Ile Lys
    110                 115                 120

TGC AAA AAT GAA TCA GCT GCC AGT TCC ACT GGG AAA ACC CCA GCA GCG 693
Cys Lys Asn Glu Ser Ala Ala Ser Ser Thr Gly Lys Thr Pro Ala Ala
125                 130                 135                 140

GAG GAG ACA GTG ACC ACC ATC CTG GGG ATG CTT GCC TCT CCC TAT CAC 741
Glu Glu Thr Val Thr Thr Ile Leu Gly Met Leu Ala Ser Pro Tyr His
                145                 150                 155

TAC CTT ATC ATC ATA GTG GTT TTA GTC ATC ATT TTA GCT GTG GTT GTG 789
Tyr Leu Ile Ile Ile Val Val Leu Val Ile Ile Leu Ala Val Val Val
            160                 165                 170

GTT GGC TTT TCA TGT CGG AAG AAA TTC ATT TCT TAC CTC AAA GGC ATC 837
Val Gly Phe Ser Cys Arg Lys Lys Phe Ile Ser Tyr Leu Lys Gly Ile
        175                 180                 185

TGC TCA GGT GGT GGA GGA GGT CCC GAA CGT GTG CAC AGA GTC CTT TTC 885
Cys Ser Gly Gly Gly Gly Gly Pro Glu Arg Val His Arg Val Leu Phe
    190                 195                 200

CGG CGG CGT TCA TGT CCT TCA CGA GTT CCT GGG GCG GAG GAC AAT GCC 933
Arg Arg Arg Ser Cys Pro Ser Arg Val Pro Gly Ala Glu Asp Asn Ala
205                 210                 215                 220

CGC AAC GAG ACC CTG AGT AAC AGA TAC TTG CAG CCC ACC CAG GTC TCT 981
Arg Asn Glu Thr Leu Ser Asn Arg Tyr Leu Gln Pro Thr Gln Val Ser
                225                 230                 235

GAG CAG GAA ATC CAA GGT CAG GAG CTG GCA GAG CTA ACA GGT GTG ACT 1029
Glu Gln Glu Ile Gln Gly Gln Glu Leu Ala Glu Leu Thr Gly Val Thr
            240                 245                 250

GTA GAG TCG CCA GAG GAG CCA CAG CGT CTG CTG GAA CAG GCA GAA GCT 1077
Val Glu Ser Pro Glu Glu Pro Gln Arg Leu Leu Glu Gln Ala Glu Ala
        255                 260                 265

GAA GGG TGT CAG AGG AGG AGG CTG CTG GTT CCA GTG AAT GAC GCT GAC 1125
Glu Gly Cys Gln Arg Arg Arg Leu Leu Val Pro Val Asn Asp Ala Asp
    270                 275                 280

TCC GCT GAC ATC AGC ACC TTG CTG GAT GCC TCG GCA ACA CTG GAA GAA 1173
Ser Ala Asp Ile Ser Thr Leu Leu Asp Ala Ser Ala Thr Leu Glu Glu
285                 290                 295                 300

GGA CAT GCA AAG GAA ACA ATT CAG GAC CAA CTG GTG GGC TCC GAA AAG 1221
Gly His Ala Lys Glu Thr Ile Gln Asp Gln Leu Val Gly Ser Glu Lys
                305                 310                 315

CTC TTT TAT GAA GAA GAT GAG GCA GGC TCT GCT ACG TCC TGC CTG 1266
Leu Phe Tyr Glu Glu Asp Glu Ala Gly Ser Ala Thr Ser Cys Leu
            320                 325                 330

TGAAAGAATC TCTTCAGGAA ACCAGAGCTT CCCTCATTTA CCTTTTCTCC TACAAAGGGA 1326
AGCAGCCTGG AAGAAACAGT CCAGTACTTG ACCCATGCCC CAACAAACTC TACTATCCAA 1386 
TATGGGGCAG CTTACCAATG GTCCTAGAAC TTTGTTAACG CACTTGGAGT AATTTTTATG 1446 
AAATACTGCG TGTGATAAGC AAACGGGAGA AATTTATATC AGATTCTTGG CTGCATAGTT 1506 
ATACGATTGT GTATTAAGGG TCGTTTTAGG CCACATGCGG TGGCTCATGC CTGTAATCCC 1566 
AGCACTTTGA TAGGCTGAGG CAGGTGGATT GCTTTGAGCT CGGGAGTTTG AGACCAGCCT 1626 
CATCAACACA GTGAAACTCC ATCTCAATTT AAAAAGAAAA AAAAGTGGTT TTAGGATGTC 1686 
ATTCTTTGCA GTTCTTCATC ATGAGACAAG TCTTTTTTTC TGCTTCTTAT ATTGCAAGCT 1746 
CCATCTCTAC TGGTGTGTGC ATTTAATGAC ATCTAACTAC AGATGCCGCA CAGCCACAAT 1806 

GCTTTGCCTT ATAGTTTTTT AACTTTAGAA CGGGATTATC TTGTTATTAC CTGTATTTTC 1866 

AGTTTCGGAT ATTTTTGACT TAATGATGAG ATTATCAAGA CGTAGCCCTA TGCTAAGTCA 1926 

TGAGCATATG GACTTACGAG GGTTCGACTT AGAGTTTTGA GCTTTAAGAT AGGATTATTG 1986 

GGGCTTACCC CCACCTTAAT TAGAGAAACA TTTATATTGC TTACTACTGT AGGCTGTACA 2046 

TCTCTTTTCC GATTTTTGTA TAATGATGTA AACATGGAAA AACTTTAGGA AATGCACTTA 2106 

TTAGGCTGTT TACATGGGTT GCCTGGATAC AAATCAGCAG TCAAAAATGA CTAAAAATAT 2166 

AACTAGTGAC GGAGGGAGAA ATCCTCCCTC TGTGGGAGGC ACTTACTGCA TTCCAGTTCT 2226 

CCCTCCTGCG CCCTGAGACT GGACCAGGGT TTGATGGCTG GCAGCTTCTC AAGGGGCAGC 2286 

TTGTCTTACT TGTTAATTTT AGAGGTATAT AGCCATATTT ATTTATAAAT AAATATTTAT 2346 

TTATTTATTT ATAAGTAGAT GTTTACATAT GCCCAGGATT TTGAAGAGCC TGGTATCTTT 2406 

GGGAAGCCAT GTGTCTGGTT TGTCGTGCTG GGACAGTCAT GGGACTGCAT CTTCCGACTT 2466 

GTCCACAGCA GATGAGGACA GTGAGAATTA AGTTAGATCC GAGACTGCGA AGAGCTTCTC 2526 

TTTCAAGCGC CATTACAGTT GAACGTTAGT GAATCTTGAG CCTCATTTGG GCTCAGGGCA 2586 

GAGCAGGTGT TTATCTGCCC CGGCATCTGC CATGGCATCA AGAGGGAAGA GTGGACGGTG 2646 

CTTGGGAATG GTGTGAAATG GTTGCCGACT CAGGCATGGA TGGGCCCCTC TCGCTTCTGG 2706 

TGGTCTGTGA ACTGAGTCCC TGGGATGCCT TTTAGGGCAG AGATTCCTGA GCTGCGTTTT 2766 

AGGGTACAGA TTCCCTGTTT GAGGAGCTTG GCCCCTCTGT AAGCATCTGA CTCATCTCAG 2826 

AGATATCAAT TCTTAAACAC TGTGACAACG GGATCTAAAA TGGCTGACAC ATTTGTCCTT 2886 

GTGTCACGTT CCATTATTTT ATTTAAAAAC CTCAGTAATC GTTTTAGCTT CTTTCCAGCA 2946

AACTCTTCTC CACAGTAGCC CAGTCGTGGT AGGATAAATT ACGGATATAG TCATTCTAGG 3006

GGTTTCAGTC TTTTCCATCT CAAGGCATTG TGTGTTTTGT TCCGGGACTG GTTTGGCTGG 3066

GACAAAGTTA GAACTGCCTG AAGTTCGCAC ATTCAGATTG TTGTGTCCAT GGAGTTTTAG 3126

GAGGGGATGG CCTTTCCGGT CTTCGCACTT CCATCCTCTC CCCACTTCCC ATCTGGCGTC 3186

CCACACCTTG TCCCCCTGCA CTTCTGGATG ACCAGGGTGC TGCTGCCTCC TAGTCTTTGC 3246

CTTTGCTGGG CCTTCTGTGC AGGAGACTTG GTCTCAAAGC TCAGAGAGAG CCAGTCCGGT 3306

CCCAGCTCCT TTGTCCCTTC CTCAGAGGCC TTCCTTGAAG ATGCATCTAG ACTACCAGCC 3366

TTATCAGTGT TTAAGCTTAT TCCTTTAACA TAAGCTTCCT GACAACATGA AATTGTTGGG 3426

GTTTTTTGGC GTTTGTTGAT TTGTTTAGGT TTTGCTTTAT ACCCGGGCCA AATAGCACAT 3486

AACACCTGGT TATATATGAA ATACTCATAT GTTTATGACC AAAATAAATA TGAAACCTCA 3546

AAAAAAAAAA AAAAAAAAAA 3566
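
The feature locations given for SEQ ID NO: 1 can be checked with simple arithmetic. The following is a minimal sketch in Python and is not part of the listing as filed. The names `FEATURES`, `span_length`, `codon_count` and `summarize` are hypothetical, and the coordinates are copied as printed in the FEATURE entries, assuming they are 1-based and inclusive.

```python
# Feature spans as printed in the FEATURE entries for SEQ ID NO: 1.
# Assumption: coordinates are 1-based and inclusive at both ends.
FEATURES = {
    "CDS": (109, 1266),
    "sig_peptide": (109, 271),
    "mat_peptide": (274, 1266),
}


def span_length(start, end):
    """Number of nucleotides covered by an inclusive span."""
    return end - start + 1


def codon_count(start, end):
    """Return (whole codons, leftover nucleotides) for an inclusive span."""
    return divmod(span_length(start, end), 3)


def summarize(features):
    """Map each feature name to (nucleotides, whole codons, leftover nucleotides)."""
    return {
        name: (span_length(*loc),) + codon_count(*loc)
        for name, loc in features.items()
    }


if __name__ == "__main__":
    for name, (nt, codons, rest) in summarize(FEATURES).items():
        print(f"{name}: {nt} nt, {codons} codons, {rest} nt left over")
```

Under these assumptions the CDS span (109..1266) covers 1158 nucleotides, or 386 codons, which agrees with the 386 amino acids stated for SEQ ID NO: 2, and the mat_peptide span (274..1266) covers 993 nucleotides, or 331 codons. The sig_peptide span as printed (109..271) covers 163 nucleotides, which is not a whole number of codons.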



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 386 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Gly Leu Trp Gly Gln Ser Val Pro Thr Ala Ser Ser Ala Arg Ala
-55                 -50                 -45                 -40

Gly Arg Tyr Pro Gly Ala Arg Thr Ala Ser Gly Thr Arg Pro Trp Leu
                -35                 -30                 -25

Leu Asp Pro Lys Ile Leu Lys Phe Val Val Phe Ile Val Ala Val Leu
            -20                 -15                 -10

Leu Pro Val Arg Val Asp Ser Ala Thr Ile Pro Arg Gln Asp Glu Val
        -5                  1               5

Pro Gln Gln Thr Val Ala Pro Gln Gln Gln Arg Arg Ser Leu Lys Glu
10                  15                  20                  25

Glu Glu Cys Pro Ala Gly Ser His Arg Ser Glu Tyr Thr Gly Ala Cys
                30                  35                  40

Asn Pro Cys Thr Glu Gly Val Asp Tyr Thr Ile Ala Ser Asn Asn Leu
            45                  50                  55

Pro Ser Cys Leu Leu Cys Thr Val Cys Lys Ser Gly Gln Thr Asn Lys
        60                  65                  70

Ser Ser Cys Thr Thr Thr Arg Asp Thr Val Cys Gln Cys Glu Lys Gly
    75                  80                  85

Ser Phe Gln Asp Lys Asn Ser Pro Glu Met Cys Arg Thr Cys Arg Thr
90                  95                  100                 105

Gly Cys Pro Arg Gly Met Val Lys Val Ser Asn Cys Thr Pro Arg Ser
                110                 115                 120

Asp Ile Lys Cys Lys Asn Glu Ser Ala Ala Ser Ser Thr Gly Lys Thr
            125                 130                 135

Pro Ala Ala Glu Glu Thr Val Thr Thr Ile Leu Gly Met Leu Ala Ser
        140                 145                 150

Pro Tyr His Tyr Leu Ile Ile Ile Val Val Leu Val Ile Ile Leu Ala
    155                 160                 165

Val Val Val Val Gly Phe Ser Cys Arg Lys Lys Phe Ile Ser Tyr Leu
170                 175                 180                 185

Lys Gly Ile Cys Ser Gly Gly Gly Gly Gly Pro Glu Arg Val His Arg
                190                 195                 200

Val Leu Phe Arg Arg Arg Ser Cys Pro Ser Arg Val Pro Gly Ala Glu
            205                 210                 215

Asp Asn Ala Arg Asn Glu Thr Leu Ser Asn Arg Tyr Leu Gln Pro Thr
        220                 225                 230

Gln Val Ser Glu Gln Glu Ile Gln Gly Gln Glu Leu Ala Glu Leu Thr
    235                 240                 245

Gly Val Thr Val Glu Ser Pro Glu Glu Pro Gln Arg Leu Leu Glu Gln
250                 255                 260                 265

Ala Glu Ala Glu Gly Cys Gln Arg Arg Arg Leu Leu Val Pro Val Asn
                270                 275                 280

Asp Ala Asp Ser Ala Asp Ile Ser Thr Leu Leu Asp Ala Ser Ala Thr
            285                 290                 295

Leu Glu Glu Gly His Ala Lys Glu Thr Ile Gln Asp Gln Leu Val Gly
        300                 305                 310

Ser Glu Lys Leu Phe Tyr Glu Glu Asp Glu Ala Gly Ser Ala Thr Ser
    315                 320                 325

Cys Leu
330



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 331 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Leu Gly Ile Trp Thr Leu Leu Pro Leu Val Leu Thr Ser Val Ala
1               5                   10                  15

Arg Leu Ser Ser Lys Ser Val Asn Ala Gln Val Thr Asp Ile Asn Ser
            20                  25                  30

Lys Gly Leu Glu Leu Arg Lys Thr Val Thr Val Glu Thr Gln Asn Leu
        35                  40                  45

Glu Gly Leu His His Asp Gly Gln Phe Cys His Pro Cys Pro Pro Gly
    50                  55                  60

Glu Arg Lys Ala Arg Asp Cys Thr Val Asn Gly Asp Glu Pro Asp Cys
65                  70                  75                  80

Val Pro Cys Gln Glu Gly Lys Glu Tyr Thr Asp Lys Ala His Phe Ser
                85                  90                  95

Ser Lys Cys Arg Arg Cys Arg Leu Cys Asp Glu Gly His Gly Leu Glu
            100                 105                 110

Val Glu Ile Asn Cys Thr Arg Thr Gln Asn Thr Lys Cys Arg Cys Lys
        115                 120                 125

Pro Asn Phe Phe Cys Asn Ser Thr Val Cys Glu His Cys Asp Pro Cys
    130                 135                 140

Thr Lys Cys Glu His Gly Ile Ile Lys Glu Cys Thr Leu Thr Ser Asn
145                 150                 155                 160

Thr Lys Cys Lys Glu Glu Gly Ser Arg Ser Asn Gly Trp Leu Cys Leu
                165                 170                 175

Leu Leu Leu Pro Ile Pro Leu Ile Val Trp Val Lys Arg Lys Glu Val
            180                 185                 190

Gln Lys Thr Cys Arg Lys His Arg Lys Glu Asn Gln Gly Ser His Glu
        195                 200                 205

Ser Pro Thr Leu Asn Pro Glu Thr Val Ala Ile Asn Leu Ser Asp Val
    210                 215                 220

Asp Leu Ser Lys Tyr Ile Thr Thr Ile Ala Gly Val Met Thr Leu Ser
225                 230                 235                 240

Gln Val Lys Gly Phe Val Arg Lys Asn Gly Val Asn Glu Ala Lys Ile
                245                 250                 255

Asp Glu Ile Lys Asn Asp Asn Val Gln Asp Thr Ala Glu Gln Lys Val
            260                 265                 270

Gln Leu Leu Arg Asn Trp His Gln Leu His Gly Lys Lys Glu Ala Tyr
        275                 280                 285

Asp Thr Leu Ile Lys Asp Leu Lys Lys Ala Asn Leu Cys Thr Leu Ala
    290                 295                 300

Glu Lys Ile Thr Ile Ile Leu Lys Asp Ile Thr Ser Asp Ser Glu Asn
305                 310                 315                 320

Ser Asn Phe Arg Asn Glu Ile Gln Ser Leu Val
                325                 330

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 427 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Gly Ala Gly Ala Thr Gly Arg Ala Met Asp Gly Pro Arg Leu Leu
1               5                   10                  15

Leu Leu Leu Leu Leu Gly Val Ser Leu Gly Gly Ala Lys Glu Ala Cys
            20                  25                  30

Pro Thr Gly Leu Tyr Thr His Ser Gly Glu Cys Cys Lys Ala Cys Asn
        35                  40                  45

Leu Gly Glu Gly Val Ala Gln Pro Cys Gly Ala Asn Gln Thr Val Cys
    50                  55                  60

Glu Pro Cys Leu Asp Ser Val Thr Phe Ser Asp Val Val Ser Ala Thr
65                  70                  75                  80

Glu Pro Cys Lys Pro Cys Thr Glu Cys Val Gly Leu Gln Ser Met Ser
                85                  90                  95

Ala Pro Cys Val Glu Ala Asp Asp Ala Val Cys Arg Cys Ala Tyr Gly
            100                 105                 110

Tyr Tyr Gln Asp Glu Thr Thr Gly Arg Cys Glu Ala Cys Arg Val Cys
        115                 120                 125

Glu Ala Gly Ser Gly Leu Val Phe Ser Cys Gln Asp Lys Gln Asn Thr
    130                 135                 140

Val Cys Glu Glu Cys Pro Asp Gly Thr Tyr Ser Asp Glu Ala Asn His
145                 150                 155                 160

Val Asp Pro Cys Leu Pro Cys Thr Val Cys Glu Asp Thr Glu Arg Gln
                165                 170                 175

Leu Arg Glu Cys Thr Arg Trp Ala Asp Ala Glu Cys Glu Glu Ile Pro
            180                 185                 190

Gly Arg Trp Ile Thr Arg Ser Thr Pro Pro Glu Gly Ser Asp Ser Thr
        195                 200                 205

Ala Pro Ser Thr Gln Glu Pro Glu Ala Pro Pro Glu Gln Asp Leu Ile
    210                 215                 220

Ala Ser Thr Val Ala Gly Val Val Thr Thr Val Met Gly Ser Ser Gln
225                 230                 235                 240

Pro Val Val Thr Arg Gly Thr Thr Asp Asn Leu Ile Pro Val Tyr Cys
                245                 250                 255

Ser Ile Leu Ala Ala Val Val Val Gly Leu Val Ala Tyr Ile Ala Phe
            260                 265                 270

Lys Arg Trp Asn Ser Cys Lys Gln Asn Lys Gln Gly Ala Asn Ser Arg
        275                 280                 285

Pro Val Asn Gln Thr Pro Pro Pro Glu Gly Glu Lys Leu His Ser Asp
    290                 295                 300

Ser Gly Ile Ser Val Asp Ser Gln Ser Leu His Asp Gln Gln Pro His
305                 310                 315                 320

Thr Gln Thr Ala Ser Gly Gln Ala Leu Lys Gly Asp Gly Gly Leu Tyr
                325                 330                 335

Ser Ser Leu Pro Pro Ala Lys Arg Glu Glu Val Glu Lys Leu Leu Asn
            340                 345                 350

Gly Ser Ala Gly Asp Thr Trp Arg His Leu Ala Gly Glu Leu Gly Tyr
        355                 360                 365

Gln Pro Glu His Ile Asp Ser Phe Thr His Glu Ala Cys Pro Val Arg
    370                 375                 380

Ala Leu Leu Ala Ser Trp Ala Thr Gln Asp Ser Ala Thr Leu Asp Ala
385                 390                 395                 400

Leu Leu Ala Ala Leu Arg Arg Ile Gln Arg Ala Asp Leu Val Glu Ser
                405                 410                 415

Leu Cys Ser Glu Ser Thr Ala Thr Ser Pro Val
            420                 425

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 453 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Gly Leu Ser Thr Val Pro Asp Leu Leu Leu Pro Leu Val Leu Glu
1               5                   10                  15

Leu Leu Val Gly Ile Tyr Pro Ser Gly Val Ile Gly Leu Val Pro His
            20                  25                  30

Leu Gly Asp Arg Glu Lys Arg Asp Ser Val Cys Pro Gln Gly Lys Tyr
        35                  40                  45

Ile His Pro Asn Asn Ser Ile Cys Cys Thr Lys Cys His Lys Gly Thr
    50                  55                  60

Tyr Leu Tyr Asn Asp Cys Pro Gly Pro Gly Gln Asp Thr Asp Cys Arg
65                  70                  75                  80

Glu Cys Glu Ser Gly Ser Phe Thr Ala Ser Glu Asn His Leu Arg His
                85                  90                  95

Cys Leu Ser Cys Ser Lys Cys Arg Lys Glu Met Gly Gln Val Glu Ile
            100                 105                 110

Ser Ser Cys Thr Val Asp Arg Asp Thr Val Cys Gly Cys Arg Lys Asn
        115                 120                 125

Gln Tyr Arg His Tyr Trp Ser Glu Asn Leu Phe Gln Cys Phe Asn Cys
    130                 135                 140

Ser Leu Cys Leu Asn Gly Thr Val His Leu Ser Cys Gln Glu Lys Gln
145                 150                 155                 160

Asn Thr Val Cys Thr Cys His Ala Gly Phe Phe Leu Arg Glu Asn Glu
                165                 170                 175

Cys Val Ser Cys Ser Asn Cys Lys Lys Ser Leu Glu Cys Thr Lys Leu
            180                 185                 190

Cys Leu Pro Gln Ile Glu Asn Val Lys Gly Thr Glu Asp Ser Gly Thr
        195                 200                 205

Thr Val Leu Leu Pro Leu Val Ile Phe Phe Gly Leu Cys Leu Leu Ser
    210                 215                 220

Leu Leu Phe Ile Gly Leu Met Tyr Arg Tyr Gln Arg Trp Lys Ser Lys
225                 230                 235                 240

Leu Tyr Ser Ile Val Cys Gly Lys Ser Thr Pro Glu Lys Glu Gly Glu
                245                 250                 255

Leu Glu Gly Thr Thr Thr Lys Pro Leu Ala Pro Asn Pro Ser Phe Ser
            260                 265                 270

Pro Thr Pro Gly Phe Thr Pro Thr Leu Gly Phe Ser Pro Val Pro Ser
        275                 280                 285

Ser Thr Phe Thr Ser Ser Ser Thr Tyr Thr Pro Gly Asp Cys Pro Asn
    290                 295                 300

Phe Ala Ala Pro Arg Arg Glu Val Ala Pro Pro Tyr Gln Gly Ala Asp
305                 310                 315                 320

Pro Ile Leu Ala Thr Ala Leu Ala Ser Asp Pro Ile Pro Asn Pro Leu
                325                 330                 335

Gln Lys Trp Glu Asp Ser Ala His Lys Pro Gln Ser Leu Asp Thr Asp
            340                 345                 350

Asp Pro Ala Thr Leu Tyr Ala Val Val Glu Asn Val Pro Pro Leu Arg
        355                 360                 365

Trp Lys Glu Phe Val Arg Arg Leu Gly Leu Ser Asp His Glu Ile Asp
    370                 375                 380

Arg Leu Glu Leu Gln Asn Gly Arg Cys Leu Arg Glu Ala Gln Tyr Ser
385                 390                 395                 400

Met Leu Ala Thr Trp Arg Arg Arg Thr Pro Arg Arg Glu Ala Thr Leu
                405                 410                 415

Glu Leu Leu Gly Arg Val Leu Arg Asp Met Asp Leu Leu Gly Cys Leu
            420                 425                 430

Glu Asp Ile Glu Glu Ala Leu Cys Gly Pro Ala Ala Leu Pro Pro Ala
        435                 440                 445

Pro Ser Leu Leu Arg
    450

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 467 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Ala Pro Pro Pro Ala Arg Val His Leu Gly Ala Phe Leu Ala Val
1               5                   10                  15

Thr Pro Asn Pro Gly Ser Ala Ala Ser Gly Thr Glu Ala Ala Ala Ala
            20                  25                  30

Thr Pro Ser Lys Val Trp Gly Ser Ser Ala Gly Arg Ile Glu Pro Arg
        35                  40                  45

Gly Gly Gly Arg Gly Ala Leu Pro Thr Ser Met Gly Gln His Gly Pro
    50                  55                  60

Ser Ala Arg Ala Arg Ala Gly Arg Ala Pro Gly Pro Arg Pro Ala Arg
65                  70                  75                  80

Glu Ala Ser Pro Arg Leu Arg Val His Lys Thr Phe Lys Phe Val Val
                85                  90                  95

Val Gly Val Leu Leu Gln Val Val Pro Ser Ser Ala Ala Thr Ile Lys
            100                 105                 110

Leu His Asp Gln Ser Ile Gly Thr Gln Gln Trp Glu His Ser Pro Leu
        115                 120                 125

Gly Glu Leu Cys Pro Pro Gly Ser His Arg Ser Glu Arg Pro Gly Ala
    130                 135                 140

Cys Asn Arg Cys Thr Glu Gly Val Gly Tyr Thr Asn Ala Ser Asn Asn
145                 150                 155                 160

Leu Phe Ala Cys Leu Pro Cys Thr Ala Cys Lys Ser Asp Glu Glu Glu
                165                 170                 175

Arg Ser Pro Cys Thr Thr Thr Arg Asn Thr Ala Cys Gln Cys Lys Pro
            180                 185                 190

Gly Thr Phe Arg Asn Asp Asn Ser Ala Glu Met Cys Arg Lys Cys Ser
        195                 200                 205

Thr Gly Cys Pro Arg Gly Met Val Lys Val Lys Asp Cys Thr Pro Trp
    210                 215                 220

Ser Asp Ile Glu Cys Val His Lys Glu Ser Gly Asn Gly His Asn Ile
225                 230                 235                 240

Trp Val Ile Leu Val Val Thr Leu Val Val Pro Leu Leu Leu Val Ala
                245                 250                 255

Val Leu Ile Val Cys Cys Cys Ile Gly Ser Gly Cys Gly Gly Asp Pro
            260                 265                 270

Lys Cys Met Asp Arg Val Cys Phe Trp Arg Leu Gly Leu Leu Arg Gly
        275                 280                 285

Pro Gly Ala Glu Asp Asn Ala His Asn Glu Ile Leu Ser Asn Ala Asp
    290                 295                 300

Ser Leu Ser Thr Phe Val Ser Glu Gln Gln Met Glu Ser Gln Glu Pro
305                 310                 315                 320

Ala Asp Leu Thr Gly Val Val Gln Ser Pro Gly Glu Ala Gln Cys Leu
                325                 330                 335

Leu Gly Pro Ala Glu Ala Glu Gly Ser Gln Arg Arg Arg Leu Leu Val
            340                 345                 350

Pro Ala Asn Gly Ala Asp Pro Thr Glu Thr Leu Met Leu Phe Phe Asp
        355                 360                 365

Lys Phe Ala Asn Ile Val Pro Phe Asp Ser Trp Asp Gln Leu Met Arg
    370                 375                 380

Gln Leu Asp Leu Thr Lys Asn Glu Ile Asp Val Val Arg Ala Gly Thr
385                 390                 395                 400

Ala Gly Pro Gly Asp Ala Leu Tyr Ala Met Leu Met Lys Trp Val Asn
                405                 410                 415

Lys Thr Gly Arg Asn Ala Ser Ile His Thr Leu Leu Asp Ala Leu Glu
            420                 425                 430

Arg Met Glu Glu Arg His Ala Lys Glu Lys Ile Gln Asp Leu Leu Val
        435                 440                 445

Asp Ser Gly Lys Phe Ile Tyr Leu Glu Asp Gly Thr Gly Ser Ala Val
    450                 455                 460

Ser Leu Glu
465



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 343 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GTCACGTTCC ATTATTTTAT TTAAAAACCT CAGTAATCGT TTTAGCTTCT TTCCAGCAAA 60 

CTCTTCTCCA CAGTAGCCCA GTCGTGGTAG GATAAATTAC GGATATAGTC ATTCTAGGGG 120 

TTTCAGTCTT TTCCATCTCA AGGCATTGTG TGTTTTGTTC CGGGACTGGT TTGGCTGGGA 180 

CAAAGTTAGA ACTGCCTGAA GTTCGCACAT TCAGATTGTT GTGTCCATGG AGTTTTAGGA 240 

GGGGATGGCC TTTCCGGTCT TCGCACTTCC ATCCTCTCCC ACTTCCATCT GGCGTCCACA 300 

ACTTGTCCCC TGCACTTCTG GATGACACAG GGTGCTGCTG CCT 343 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 279 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GTGGACGGTG CTTGGGAATG GTGTGAAATG GTTGCCGACT CAGGCATGGA TGGGCCCCTC 60 

TCGCTTCTGG TGGTCTGTGA ACTGAGTCCC TGGGATGCCT TTAGGGCAGA GATTCCTGAG 120 

CTGCGTTTTA GGGTACAGAT TCCCTGTTTG AGGAGCTTGG CCCCTCTGTA AGCGTCTGAC 180 

TCATCTCAGA GATATCAATT CTTAAACACT GTGACAACGG GATCTAAAAT GGCTGACACA 240 

TTTGTCCTTG TGTCACGTTC CATTATTTTA TTTAAAATT 279

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 250 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GGCCACGTAG TGCCACGTGC CACAAACTAC GGGGGACGAT TTCTGATTGA ATTTTTGGCG
CTTTCAATCC ACCCTCCTCC CTTCTAATGG GACTTTGGGG ACAAAGGTCC GACCGCCTCG
AGCGTCGACA GGGCGCTATC CAGGAGCCAG GACAGCGTCG GGAACCAGAC CATGGCTCCT
GGACCCCAAG ATCCTTAAGT TCGTCGTCTT CATCGTCGGG TTCTCTGCCG GTAAGTTAGG
AGGTCCCTGG

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
CGCCCATGGC CACCATCCCC CGGCAG 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CGCAAGCTTT TAGTAGTGAT AGGGAGAGGC 
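
The relationship between the oligonucleotide of SEQ ID NO: 11 and the coding region of SEQ ID NO: 1 can be illustrated by taking a reverse complement. The following is a minimal sketch in Python and is not part of the listing as filed. The names `reverse_complement`, `PRIMER_SEQ_ID_11` and `CDS_FRAGMENT` are hypothetical, and `CDS_FRAGMENT` is an assumption: it is the six codons GCC TCT CCC TAT CAC TAC (Ala Ser Pro Tyr His Tyr) copied from the translated rows of SEQ ID NO: 1.

```python
# Complement table for the four unambiguous DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# SEQ ID NO: 11 as printed, with the grouping spaces removed.
PRIMER_SEQ_ID_11 = "CGCAAGCTTTTAGTAGTGATAGGGAGAGGC"

# Assumption: six consecutive codons from SEQ ID NO: 1
# (GCC TCT CCC TAT CAC TAC, i.e. Ala Ser Pro Tyr His Tyr).
CDS_FRAGMENT = "GCCTCTCCCTATCACTAC"


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    rc = reverse_complement(PRIMER_SEQ_ID_11)
    print(rc)
    print(rc.startswith(CDS_FRAGMENT))
```

Under these assumptions the reverse complement of SEQ ID NO: 11 begins with the same 18 nucleotides as the chosen fragment of SEQ ID NO: 1, which suggests that this oligonucleotide is complementary to that stretch of the coding strand.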
(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CGCGGATCCG CCATCATGGG ACTTTGGGGA CAA
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CGCGGTACCT TAGTAGTGAT AGGGAGAGGC
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 58 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CGCTCTAGAT CAAGCGTAGT CTGGGACGTC GTATGGGTAG TAAGTGATAG GGAGAGGC 58 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 408 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GAGTTTGACC AGAGATGCAA GGGGTGAAGG AGCGCTTCCT ACCGTTAGGA ACTCTGGGGA 60 

CAGAGCGCCC CGGCCGCCTG ATGGCGAGGC AGGGTGCGAC CCAGGACCCA GGACGGCGTC 120 

GGGAACCATA CCATGGCCCG GATCCCCAAG ACCCTAAAGT TCGTCGTCGT CATCGTCGCG 180 

GTCCTGCTGC CAGTCCTAGC TTACTCTGCC ACCACTGCCC GGCAGAGGGA AGTTCCCCAG 250

CAGACAGTGG CCCCACAGCA ACAGAGGCAC AGCTTCAAGG GGGAGGAGTG TCCAGCAGGA 310 

TCTCATAGAT CAGAACATAC TGGAGCCTGT AACCCGTGCA CAGAGGGTGT GGATTACACC 370 

AACGCTTCCA ACAATGAACC TTCTTGCTTC CCATGTAC 408 



